1. Field of the Invention
The present invention relates to techniques for compressing images that include text portions.
2. Description of the Related Art
In recent years, there have been increasing demands for digitization of paper documents by scanners and for conversion of the formats of electronic documents for information sharing. In general, an image of a page captured by a scanner is constituted by raster data. For a format conversion of an electronic document, a page of the electronic document having a specific format may be converted into raster data. When a paper document is converted into an electronic document, the amount of data becomes large. Therefore, a technique for reducing the amount of data by JPEG compression is known, as disclosed, for example, in Japanese Patent Laid-Open No. 08-147446 (corresponding to U.S. Pat. No. 5,907,835).
JPEG compression is suitable for compression of multilevel images such as photographs but is not suitable for compression of text portions. Japanese Patent Laid-Open No. 2003-018413 (corresponding to U.S. Patent Application Publication No. 2002/0037100) discloses a method for effectively compressing images including a multilevel image and a text portion.
An image constituted by partial-page image data, which is part of the data constituting one page, can contain characters of various sizes. However, techniques for effectively recognizing the characters included in such partial-page images have not been established in the related art. Therefore, image compression that relies on the recognition results of related-art techniques does not work effectively.